This work provides a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching. While this dynamic program has historically been considered intractable, our results show that several policy learning approaches are competitive with or outperform classical methods. To train these algorithms, we develop novel techniques to convert historical data into a simulator. On the theoretical side, we present learnability results on a subclass of inventory control problems, where we provide a provable reduction of the reinforcement learning problem to that of supervised learning. On the algorithmic side, we present a model-based reinforcement learning procedure (Direct Backprop) that solves the periodic review inventory control problem by constructing a differentiable simulator. Under a variety of metrics, Direct Backprop outperforms model-free RL and newsvendor baselines, both in simulations and in real-world deployments.
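To make the setting in the abstract above concrete, the following is a minimal sketch of a periodic review system with lost sales and per-order lead times, driven by a simple order-up-to (base-stock) rule. It is not the paper's simulator or its Direct Backprop procedure. The function name `simulate_lost_sales`, the `price` and `cost` parameters, and the assumption that every lead time is at least one period are all illustrative choices.

```python
def simulate_lost_sales(demands, lead_times, base_stock, price=2.0, cost=1.0):
    """Simulate a periodic-review, lost-sales system under a base-stock rule.

    Assumptions (illustrative, not from the paper): each lead time is >= 1,
    an order placed in period t arrives at the start of period t + lead time,
    and unmet demand is lost rather than backordered.
    """
    on_hand = 0
    pipeline = []  # outstanding orders as (arrival_period, quantity)
    profit = 0.0
    lost = 0
    sold = 0
    for t, (demand, lead_time) in enumerate(zip(demands, lead_times)):
        if lead_time < 1:
            raise ValueError("this sketch assumes lead times of at least 1")
        # Receive every order scheduled to arrive by this period.
        on_hand += sum(q for arrival, q in pipeline if arrival <= t)
        pipeline = [(arrival, q) for arrival, q in pipeline if arrival > t]
        # Review: order up to base_stock using the inventory position.
        position = on_hand + sum(q for _, q in pipeline)
        order = max(0, base_stock - position)
        if order > 0:
            pipeline.append((t + lead_time, order))
        # Serve demand from stock on hand; the shortfall is lost.
        sales = min(on_hand, demand)
        lost += demand - sales
        sold += sales
        on_hand -= sales
        profit += price * sales - cost * order
    return {"profit": profit, "lost_sales": lost, "units_sold": sold}


result = simulate_lost_sales(demands=[3, 3, 3], lead_times=[1, 1, 1], base_stock=5)
print(result)
```

A learned policy would replace the fixed `base_stock` rule with a function of the state (stock on hand, pipeline, demand history); the simulation loop itself would stay the same.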
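Reliable point cloud data is essential for perception tasks, e.g., in robotics and autonomous driving applications. Adverse weather causes a specific type of noise in light detection and ranging (LiDAR) sensor data, which significantly degrades the quality of the point clouds. To address this issue, this letter presents a novel deep learning algorithm for denoising point clouds captured in adverse weather (4DenoiseNet). Unlike the deep learning adverse weather denoising methods in the literature, our algorithm takes advantage of the time dimension. It performs about 10% better on the intersection over union metric than previous work and is more computationally efficient. These results are achieved on our novel SnowyKITTI dataset, which contains over 40000 point clouds annotated for adverse weather. Moreover, strong qualitative results on the Canadian Adverse Driving Conditions dataset indicate good generalizability to domain shifts and to different sensor intrinsics.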
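For readers unfamiliar with the intersection over union metric cited in the abstract above, here is a minimal sketch for per-point binary labels (1 = weather noise, 0 = valid return). The labeling convention and the function name `intersection_over_union` are assumptions made for illustration; the letter's own evaluation code may differ.

```python
def intersection_over_union(predicted, target):
    """Return IoU between two equal-length sequences of binary point labels.

    Illustrative convention: a truthy label marks a point as noise.
    If neither sequence marks any point, the IoU is defined here as 1.0.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    # Points marked as noise in both sequences.
    intersection = sum(1 for p, t in zip(predicted, target) if p and t)
    # Points marked as noise in at least one sequence.
    union = sum(1 for p, t in zip(predicted, target) if p or t)
    return intersection / union if union else 1.0


predicted = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
print(intersection_over_union(predicted, target))  # 2 shared / 4 in union = 0.5
```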
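There is a growing need for sparse representational formats of human affective states that can be used in scenarios with limited computational memory resources. We explore whether representing neural data recorded in response to emotional stimuli in a latent vector space can serve both to predict emotional states and to generate synthetic EEG data that are participant- and/or emotion-specific. We propose a conditional variational autoencoder based framework, EEG2Vec, to learn generative-discriminative representations from EEG data. Experimental results on affective EEG recording datasets demonstrate that our model is suitable for unsupervised EEG modeling, that classification of three distinct emotion categories (positive, neutral, negative) based on the latent representation achieves a robust performance of 68.49%, and that the generated synthetic EEG sequences resemble real EEG data inputs, in particular reconstructing low-frequency signal components. Our work advances areas where affective EEG representations can be useful, e.g., generating artificial (labeled) training data or alleviating manual feature extraction, and provides efficiency for memory-constrained edge computing applications.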